A (v,k,t) covering design, or covering, is a family of k-subsets, called blocks, chosen from a v-set, such that
each t-subset is contained in at least one of the blocks. The number of blocks is the covering's size, and the
minimum size of such a covering is denoted by $C(v,k,t)$. It is easy to see that a covering must contain at least
$\binom{v}{t} / \binom{k}{t}$ blocks, and in 1985 Rödl [^] proved a long-standing conjecture of Erdős and Hanani that for fixed
k and t, coverings of size $\binom{v}{t} / \binom{k}{t}\,(1 + o(1))$ exist (as $v \to \infty$).

An earlier paper by the first three authors [Q] gave new methods for constructing good coverings, and gave
tables of upper bounds on $C(v,k,t)$ for small v, k, and t. The present paper shows that two of those constructions
are asymptotically optimal: for fixed k and t, the size of the coverings constructed matches Rödl's bound. The
paper also makes the $o(1)$ error bound explicit, and gives some evidence for a much stronger bound.



1. INTRODUCTION



Let the covering number $C(v,k,t)$ denote the smallest number
of k-subsets of a v-set that cover all t-subsets. The
best general lower bound on $C(v,k,t)$, due to Schönheim [^],
comes from the following inequality:

Theorem 1. $C(v,k,t) \ge \left\lceil \frac{v}{k}\, C(v-1,k-1,t-1) \right\rceil$.

The best general upper bound on $C(v,k,t)$ is due to Rödl [Q].
The density of a covering is the average number of blocks
containing a t-set. The minimum density is $C(v,k,t)\binom{k}{t} / \binom{v}{t}$,
and is obviously at least 1. Rödl showed that for k and t fixed
there exist coverings with density $1 + o(1)$ as v gets large.
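The two quantities just introduced are easy to compute for small parameters. The following is a minimal sketch, not taken from the paper: it assumes the standard recursive form of the Schönheim inequality, $C(v,k,t) \ge \lceil (v/k)\, C(v-1,k-1,t-1) \rceil$ with $C(v,k,0) = 1$, and the function names are hypothetical.

```python
from math import comb


def schonheim_lower_bound(v: int, k: int, t: int) -> int:
    """Iterate the (assumed standard) Schonheim inequality down to t = 0."""
    if t == 0:
        return 1  # one block suffices to cover the empty set
    inner = schonheim_lower_bound(v - 1, k - 1, t - 1)
    return -(-v * inner // k)  # integer ceiling of v * inner / k


def density(num_blocks: int, v: int, k: int, t: int) -> float:
    """Average number of blocks containing a t-set."""
    return num_blocks * comb(k, t) / comb(v, t)


if __name__ == "__main__":
    for v, k, t in [(7, 3, 2), (5, 3, 2), (10, 4, 3)]:
        lb = schonheim_lower_bound(v, k, t)
        print(v, k, t, lb, density(lb, v, k, t))
```

For (7,3,2) the bound is 7, which the Fano plane attains with density exactly 1.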

This paper shows that two of our constructions [Q] match
the bound of Rödl's theorem. One of the constructions gives
an easier proof of the theorem than Rödl's original proof [^].
The other construction provides a computationally efficient
version of Rödl's theorem. In Section 2 we review the two
constructions. In Section 3 we show that the first one, which
uses a greedy algorithm, is asymptotically optimal. And in
Section 4 we show that the second one, which constructs
an induced covering from a finite-geometry covering, is also
asymptotically optimal, and that it is computationally efficient
as well.

Theorem 3 (in Section 3) is a special case of a main result of
the fourth author [^]; Rödl and Thoma [^] gave another proof
of that result. We present the proof here to keep the paper
self-contained and to provide an explicit error bound for use
in Section 4.

2. COVERING CONSTRUCTIONS

2.1. Greedy Coverings

Algorithm 1. Random Greedy (v,k,t) Covering

1. Fix a random ordering of the k-sets of a v-set.

2. Choose the earliest k-set containing no already-covered
t-set.

3. Repeat Step 2 until no k-set can be chosen.

4. Cover the remaining t-sets with one k-set each.
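The four steps above can be sketched in a few lines for small parameters. This is an illustrative sketch only: the function name and the `seed` parameter are hypothetical, every k-set is enumerated explicitly (so it is practical only for small v), and Step 4 pads each leftover t-set with the first available points, which is one arbitrary way to choose "one k-set each". A single pass over the fixed ordering implements Steps 2 and 3, because a k-set rejected once can never become eligible later.

```python
import random
from itertools import combinations


def random_greedy_covering(v, k, t, seed=0):
    """Sketch of Algorithm 1 on the point set {0, ..., v-1}; assumes v >= k >= t."""
    rng = random.Random(seed)
    k_sets = list(combinations(range(v), k))
    rng.shuffle(k_sets)  # Step 1: fix a random ordering of the k-sets

    covered = set()
    blocks = []
    for block in k_sets:  # Steps 2 and 3: one pass in the fixed order
        t_subsets = list(combinations(block, t))
        if not any(s in covered for s in t_subsets):
            blocks.append(block)
            covered.update(t_subsets)

    # Step 4: one extra k-set for each t-set still uncovered after Step 3
    remaining = [s for s in combinations(range(v), t) if s not in covered]
    for t_set in remaining:
        pad = [x for x in range(v) if x not in t_set][: k - t]
        blocks.append(tuple(sorted(t_set + tuple(pad))))
    return blocks


if __name__ == "__main__":
    blocks = random_greedy_covering(8, 3, 2, seed=1)
    print(len(blocks), "blocks")
```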

This greedy algorithm is a little different from our previous
one. That algorithm uses one of four possible orderings
in Step 1: lexicographic, colex, Gray code, or random. Also,
it chooses in Step 2 the earliest k-set that contains the most
still-uncovered t-sets; thus it continues with Steps 2 and 3
instead of cutting out to Step 4. That algorithm produces slightly
better coverings in practice, but is harder to analyze than the
algorithm here.



2.2. Induced Finite Geometry Coverings 

The k-flats of an affine or projective geometry form a covering.
For this paper, we restrict our attention to the hyperplanes
of an affine geometry, which form an optimal covering:

Theorem 2. For a prime power q and integer t > 1, the hyperplanes
of the affine geometry AG(t,q) are a $(q^t, q^{t-1}, t)$ covering
of size

$$C(q^t, q^{t-1}, t) = \frac{q^{t+1} - q}{q - 1}.$$

The density of such a covering is

$$\frac{q^{t+1} - q}{q - 1} \binom{q^{t-1}}{t} \Big/ \binom{q^t}{t}.$$
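As a quick numerical illustration of Theorem 2, the sketch below evaluates the hyperplane count and the resulting density for a few small cases. The function names are hypothetical, and the density is computed directly from the definition (size times $\binom{k}{t}$ divided by $\binom{v}{t}$) rather than from any closed form in the paper.

```python
from math import comb


def affine_hyperplane_count(q: int, t: int) -> int:
    """Number of hyperplanes of AG(t, q), i.e. (q^(t+1) - q) / (q - 1)."""
    return (q ** (t + 1) - q) // (q - 1)


def affine_covering_density(q: int, t: int) -> float:
    """Density of the (q^t, q^(t-1), t) covering formed by the hyperplanes."""
    v, k = q ** t, q ** (t - 1)
    return affine_hyperplane_count(q, t) * comb(k, t) / comb(v, t)


if __name__ == "__main__":
    for q in (3, 5, 7, 11, 13):
        print(q, affine_hyperplane_count(q, 3), affine_covering_density(q, 3))
```

For t = 2 the hyperplanes are the lines of an affine plane, and the density is exactly 1; for t = 3 the printed densities decrease toward 1 as q grows.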
Here we summarize two methods for constructing asymptotically
optimal coverings. Our previous paper [Q] gives more
details, as well as computational results for small v, k, and t.



Algorithm 2. Induced (v,k,t) Covering

1. Choose a prime p with $p^t > v$, and an integer $\ell$, as
specified later.

2. Precompute $(\ell',k,t)$ coverings, for $\ell \le \ell' \le 9\ell$, using
Algorithm 1.

3. Choose v points of the AG(t,p) at random.

4. For each hyperplane, find its intersection with the
v points; let $\ell'$ be the size of the intersection.

(a) If $\ell \le \ell' \le 9\ell$, add the blocks of the $(\ell',k,t)$
covering on those points to the (v,k,t) covering.

(b) If $\ell' < \ell$ or $\ell' > 9\ell$, trivially add $\binom{\ell'}{t}$ blocks to the
(v,k,t) covering.

The new blocks each have k elements, and together they
cover all t-sets, so they form a (v,k,t) covering. The blocks
of the affine covering and their intersection with the v-set may
quickly be computed by solving linear equations over GF(p).
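A small-scale sketch of Algorithm 2 follows, for a prime p only. It is an illustration under several assumptions that are not in the paper: the helper names are hypothetical; hyperplanes are enumerated as solution sets of $a \cdot x \equiv b \pmod p$ with the first nonzero entry of a normalized to 1; the coverings of Step 2 are built on the fly on each intersection instead of being precomputed and relabeled; an extra guard requires at least k points before the greedy step is used; and the trivial blocks of Step 4(b) are padded with arbitrary chosen points.

```python
import random
from itertools import combinations, product


def affine_hyperplanes(p, t):
    """Yield (a, b) describing {x in GF(p)^t : a.x = b mod p}, one pair per hyperplane."""
    for a in product(range(p), repeat=t):
        if next((c for c in a if c), 0) == 1:  # first nonzero entry of a is 1
            for b in range(p):
                yield a, b


def greedy_cover(pts, k, t, rng):
    """Random greedy covering of the t-subsets of pts (needs len(pts) >= k)."""
    cands = list(combinations(pts, k))
    rng.shuffle(cands)
    covered, blocks = set(), []
    for blk in cands:
        subs = list(combinations(blk, t))
        if not any(s in covered for s in subs):
            blocks.append(blk)
            covered.update(subs)
    for T in combinations(pts, t):  # leftover t-sets get one block each
        if T not in covered:
            pad = [x for x in pts if x not in T][: k - t]
            blocks.append(tuple(sorted(T + tuple(pad))))
    return blocks


def trivial_cover(pts, all_pts, k, t):
    """One block per t-subset of pts, padded with other chosen points."""
    blocks = []
    for T in combinations(pts, t):
        pad = [x for x in all_pts if x not in T][: k - t]
        blocks.append(tuple(sorted(T + tuple(pad))))
    return blocks


def induced_covering(v, k, t, p, ell, seed=0):
    rng = random.Random(seed)
    space = list(product(range(p), repeat=t))
    points = sorted(rng.sample(space, v))          # Step 3
    blocks = []
    for a, b in affine_hyperplanes(p, t):          # Step 4
        hit = [x for x in points
               if sum(ai * xi for ai, xi in zip(a, x)) % p == b]
        if max(ell, k) <= len(hit) <= 9 * ell:     # Step 4(a), with the extra guard
            blocks += greedy_cover(hit, k, t, rng)
        else:                                      # Step 4(b)
            blocks += trivial_cover(hit, points, k, t)
    return points, blocks


if __name__ == "__main__":
    points, blocks = induced_covering(v=15, k=3, t=2, p=5, ell=3)
    print(len(points), "points,", len(blocks), "blocks")
```

The covering property holds because any t points of AG(t,p) lie in a common hyperplane, so every t-subset of the chosen points appears in some intersection.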

This construction, too, differs slightly from our earlier version
[^]. In that paper, we construct $(\ell',k,t)$ coverings for
all $\ell' < v$ by whatever construction gives the best results, and
then always use Step 4(a). That results in better coverings in
practice, but is harder to analyze.

3. GREEDY COVERINGS AND RÖDL'S BOUND

The usual proofs of Rödl's theorem (Rödl, or Alon and
Spencer) seem nonconstructive; however, they are actually
analyses of a covering algorithm, similar to the greedy
algorithm with random ordering, that constructs a covering in
two steps. First, it chooses a sequence of Rödl nibbles, each
of which is a small, random collection of k-sets that do not
contain any t-set contained in any previous Rödl nibble. Second,
when there is no longer room for a nibble, it chooses a
separate k-set for each remaining uncovered t-set.

The main difference between the k-sets chosen in the sequence
of Rödl nibbles and those chosen by the greedy algorithm
in Steps 2 and 3 is that two k-sets in the same nibble may
intersect each other in a t-set. This difference seems small,
hence it is natural to conjecture that the greedy algorithm, too,
meets Rödl's bound. It does:

Theorem 3. For fixed k and t, the greedy algorithm with
random ordering produces a covering with expected density
$1 + o(1)$ as $v \to \infty$.

The proof of Theorem 3 will proceed in several steps, along
the lines of Spencer.

3.1. The Continuous Model 

Model the execution of the greedy algorithm as a Poisson
process; that is, a given k-set is chosen between time $\tau$ and
$\tau + \delta$ with probability asymptotic to $\delta / \binom{v-t}{k-t}$ as $\delta \to 0$, and
the probabilities of any two k-sets being chosen in any two
time intervals are independent. The process begins at time 0
and lasts forever. If a k-set chosen by the process at some
time $\tau$ contains any previously covered t-set, it fails at time $\tau$;
otherwise it succeeds and its t-sets are considered covered
after time $\tau$. The k-set thus fails at any time subsequent to $\tau$
at which it is chosen.



The ordering determined by the first-choosings of the k-sets
in this process corresponds to the random ordering of the
k-sets in the greedy algorithm, and the k-sets that have succeeded
at time infinity correspond to the k-sets chosen by the
greedy algorithm just prior to Step 4. Thus to prove the theorem
it suffices to show that, at time infinity of the Poisson
process, a given t-set is covered with probability asymptotic
to 1. (If the proportion of t-sets covered at that point of
the greedy algorithm goes to 1, then so does the density of the
eventual covering.) We actually find the limit of this probability
as $v \to \infty$ for every fixed $\tau$, and we show that this limit
goes to 1 as $\tau \to \infty$.

Fix a time $\tau$ and a t-set T. Based on the Poisson process
above, we either define the dependence tree of $(\tau, T)$ or else
declare it to be aborted. The tree is rooted, and has t-vertices
and k-vertices. We begin at time $\tau$ with the tree consisting only
of its t-vertex root $(\tau, T)$, and we examine k-sets chosen by
the process, proceeding backwards in time from $\tau$ toward 0.

There are three cases for a k-set $K^*$ chosen at some time $\tau^*$:
if $K^*$ does not contain any $T'$ already in the tree then do nothing;
if it contains two or more such $T'$ then declare the tree
to be aborted; if (the important case) it contains precisely one
such $T'$ then add $(\tau^*, K^*)$ as a child of $(\tau', T')$ and for every
t-set $T^* \subset K^*$ except $T'$ add $(\tau^*, T^*)$ as a child of $(\tau^*, K^*)$.
We will say that $T'$ has given birth to $K^*$ at time $\tau^*$, and $K^*$
immediately gives birth to all the $T^*$ nodes.

The tree, if defined, is finite; a child of a t-vertex is a
k-vertex and vice versa. We label each vertex as follows. A
t-vertex is covered if at least one of its children is accepted, else
it is uncovered; a k-vertex is accepted if none of its children is
covered, else it is rejected. Thus a childless (leaf) t-vertex is
uncovered, and a unique labeling is defined inductively from
the leaves up.

Example 1. Take t = 2; k = 3; v = 10"'; $\tau$ = 4.3; T = {1,2}.
Suppose {1,2,3} is chosen at time 3.7 and {2,3,4} at time 1.2
and these are the only relevant chosen sets. The dependence
tree of (4.3, {1,2}) is shown in Figure 1. Two of the leaves,
(1.2, {2,4}) and (1.2, {3,4}), are uncovered, thus their parent
(1.2, {2,3,4}) is accepted, so (3.7, {2,3}) is covered and
(3.7, {1,2,3}) is rejected and finally (4.3, {1,2}) is uncovered.
In the corresponding Poisson process, {2,3,4} succeeds
at time 1.2, thus {1,2,3} fails at time 3.7, so no 3-set covering
{1,2} is accepted by time 4.3.
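The labeling rule can be checked mechanically on the tree of Example 1. The sketch below uses a hypothetical tuple encoding `(kind, time, point_set, children)` for vertices, with `kind` equal to `"t"` or `"k"`; only the labeling rule itself comes from the text.

```python
def label(node):
    """Label a vertex from the leaves up, following the rule in the text."""
    kind, _time, _points, children = node
    child_labels = [label(child) for child in children]
    if kind == "t":
        # A t-vertex is covered iff at least one child k-vertex is accepted.
        return "covered" if "accepted" in child_labels else "uncovered"
    # A k-vertex is accepted iff none of its child t-vertices is covered.
    return "rejected" if "covered" in child_labels else "accepted"


# Dependence tree of (4.3, {1,2}) from Example 1.
leaf_24 = ("t", 1.2, {2, 4}, [])
leaf_34 = ("t", 1.2, {3, 4}, [])
k_234 = ("k", 1.2, {2, 3, 4}, [leaf_24, leaf_34])
t_23 = ("t", 3.7, {2, 3}, [k_234])
t_13 = ("t", 3.7, {1, 3}, [])
k_123 = ("k", 3.7, {1, 2, 3}, [t_13, t_23])
root = ("t", 4.3, {1, 2}, [k_123])

if __name__ == "__main__":
    for name, node in [("(1.2,{2,3,4})", k_234), ("(3.7,{2,3})", t_23),
                       ("(3.7,{1,2,3})", k_123), ("(4.3,{1,2})", root)]:
        print(name, label(node))
```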

This example is consistent with the claim below. 

Claim. Suppose the dependence tree of $(\tau, T)$ for some
$\tau$ and T is defined. Then $(\tau, T)$ is covered if and only if T is
covered by the Poisson process.

Proof of claim. If T is covered in the Poisson process by a
k-set K, then K succeeded at some time $\tau^*$. Thus no k-set
containing any of the t-sets covered by K was chosen before $\tau^*$,
and $(\tau^*, K)$ is accepted, hence $(\tau, T)$ is covered. Conversely,
suppose that $(\tau, T)$ is covered in its dependence tree. Then
it has an accepted child. It might have several accepted children,
but since the tree is defined, the k-sets of these children
can intersect only in T. The earliest such k-set succeeded, so
it covers T. That establishes the claim. □

Figure 1: Example of a dependence tree.



3.2. The Idealized Tree 

The process above is still difficult to analyze directly, so we 
will define for a fixed T an idealized process and an idealized 
tree, analogous to the Poisson process and dependence tree. 
We will show that the idealized trees behave like the depen- 
dence trees, and then find the probability that the root of an 
idealized tree is covered. 

The idealized tree has t-vertices and k-vertices, and consists
at time $\tau$ just of a t-vertex root. Again, time goes backwards,
from $\tau$ to 0. In the interval from $\tau_1$ to $\tau_1 - \delta$ each
t-vertex has probability asymptotic to $\delta$ of giving birth to a
k-vertex, which then instantly gives birth to $D = \binom{k}{t} - 1$ new
t-vertices. In a length $\delta$ interval each t-vertex has on average
$\delta D$ grandchildren (also t-vertices), so the expected number of
t-vertices goes up by a factor of $1 + \delta D$. The expected number
of t-vertices at time 0 is thus $(1 + \delta D)^{\tau/\delta} = e^{\tau D}(1 + O(\delta))$ as
$\delta \to 0$, hence with probability 1 the idealized tree is finite.
The notions of covered, uncovered, accepted, and rejected are
defined on it as before.

We claim that the limit distribution of the dependence tree
of $(\tau, T)$ as $v \to \infty$ is the distribution for the idealized tree.
Consider a fixed idealized tree at time $\tau$, and look at the
dependence tree of $(\tau, T)$ from time $\tau_1$ to $\tau_1 - \delta$ given that at $\tau_1$
it matches the idealized tree. The number of t-sets in the tree
is $O(e^{\tau D})$, with probability asymptotic to 1, so the number
of k-sets that contain more than one t-set already in the tree
is $O(e^{2\tau D} v^{k-t-1})$, and thus the probability of aborting (i.e.,
that some such k-set is chosen) is $O(\delta e^{2\tau D} v^{-1})$. Therefore
the total chance of aborting throughout the length $\tau$ interval
is $O(\tau e^{2\tau D} v^{-1}) = o(1)$ for $\tau < (\ln v) / ((2 + \varepsilon) D)$, for any fixed
$\varepsilon > 0$.

For each $T'$ in the tree, the number of k-sets that contain $T'$
and no other t-set in the tree is asymptotically $\binom{v-t}{k-t}$, so $T'$ has
a (k-vertex) child with probability asymptotic to $\delta$, as in the
idealized version. Hence the two distributions are the same,
as claimed.

Now we compute the probability $P(\tau)$ that the root of an
idealized tree at time $\tau$ is uncovered. In the interval from $\tau$
to $\tau - \delta$ of an idealized process, a t-vertex either does or does
not give birth, with probabilities asymptotic to $\delta$ and $1 - \delta$
as $\delta \to 0$. In the former case, a k-vertex child is accepted
with probability $P(\tau - \delta)^D$, because each t-vertex grandchild
has independent probability $P(\tau - \delta)$ of being uncovered at
time $\tau - \delta$, and thus is rejected with probability $1 - P(\tau - \delta)^D$.
Hence

$$P(\tau) \sim \delta \left(1 - P(\tau - \delta)^D\right) P(\tau - \delta) + (1 - \delta) P(\tau - \delta).$$

So $P(\tau - \delta) - P(\tau) \sim \delta P(\tau - \delta)^{D+1}$, which leads to the
differential equation $P'(\tau) = -P(\tau)^{D+1}$ with the initial condition
$P(0) = 1$. The solution is

$$P(\tau) = (\tau D + 1)^{-1/D}.$$

In particular $\lim_{\tau \to \infty} P(\tau) = 0$, so the root of an idealized tree
at time infinity is covered with probability asymptotic to 1.
Therefore, at time infinity of the Poisson process, a given t-set
is covered with probability asymptotic to 1, and Theorem 3 is
established.
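The closed form can be sanity-checked numerically by iterating the recurrence $P(\tau) = P(\tau - \delta) - \delta P(\tau - \delta)^{D+1}$ from $P(0) = 1$ with a small step, which is just Euler's method for the differential equation. The function names and the step count below are arbitrary choices for illustration.

```python
def p_exact(tau: float, D: int) -> float:
    """Closed-form solution P(tau) = (tau*D + 1)^(-1/D)."""
    return (tau * D + 1.0) ** (-1.0 / D)


def p_recurrence(tau: float, D: int, steps: int = 100_000) -> float:
    """Iterate P <- P - delta * P^(D+1) starting from P(0) = 1."""
    delta = tau / steps
    p = 1.0
    for _ in range(steps):
        p -= delta * p ** (D + 1)
    return p


if __name__ == "__main__":
    for D in (2, 5, 9):  # D = C(k,t) - 1 for (k,t) = (3,2), (4,2), (5,2)
        for tau in (1.0, 5.0, 20.0):
            print(D, tau, p_exact(tau, D), p_recurrence(tau, D))
```

The two columns agree to several decimal places, and both decay slowly toward 0 as $\tau$ grows, which is the source of the weak $(\log v)^{-1/D}$ error term derived in the next subsection.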



3.3. Estimating The Error Term 

The proof above shows that the greedy covering is optimal, 
but we have not estimated the error term. We conclude this 
section by giving a weak estimate, along with some evidence 
for a stronger conjecture. 

Figure 2: Average density $\delta$ of random greedy coverings. (Log-log plot of $\ln(\delta - 1)$ against $\ln v$.)

Consider the state of the algorithm at time $\tau = O(\log v)$.
First, notice that at this time of the Poisson process, the expected
number of k-sets chosen is $O(v^t \log v)$. Thus in the
greedy algorithm it suffices to examine just $O(v^t \log v)$ random
k-sets before cutting out to Step 4. It takes only $O(v^t \log v)$
expected time and $O(v^t)$ space to generate those k-sets (Brassard
and Kannan [^]), so this early abort strategy dramatically
speeds up the algorithm, at negligible cost to the density of
the covering:

Corollary 1. The early-abort greedy algorithm produces a
covering with expected density $1 + o(1)$ in time $O(v^t \log v)$.

Second, at time $\tau = (\ln v) / ((2 + \varepsilon) D)$ for any fixed $\varepsilon > 0$,
the probability of a t-set being uncovered is $P(\tau) =
O((\log v)^{-1/D})$. Thus:

Corollary 2. The expected density of a covering produced
by the random greedy algorithm is $1 + O((\log v)^{-1/D})$, where
$D = \binom{k}{t} - 1$.

This bound is pessimistic. Figure 2 gives log-log plots for
several (k,t) pairs, based on 1000 random greedy coverings
per (v,k,t) triple for v < 50, and 10^^^ such coverings for
v > 50. The apparent asymptotic linearity of the plots suggests
that the expected density of a random greedy covering
for k and t fixed is $1 + \Theta(v^{-\alpha})$, for some positive $\alpha = \alpha(k,t)$,
as $v \to \infty$.

To estimate $\alpha$ for each of the curves in Figure 2, we used
the tails of the curves (100 < v < 150) for a least-squares fit
to a straight line. That gave us rough estimates for the slopes
$-\alpha(k,t)$, as indicated in Table I. Those values suggest:
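The slope estimate described here is an ordinary least-squares line through the points $(\ln v, \ln(\delta - 1))$. The sketch below shows the computation on synthetic input: the densities are generated from an exact power law chosen for illustration, they are not the measurements behind Figure 2 or Table I, and the function name is hypothetical.

```python
import math


def loglog_slope(vs, densities):
    """Least-squares slope of ln(density - 1) against ln(v)."""
    xs = [math.log(v) for v in vs]
    ys = [math.log(d - 1.0) for d in densities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic data only: density = 1 + 2 * v^(-0.5) on the tail 101..149.
    vs = list(range(101, 150))
    densities = [1.0 + 2.0 * v ** -0.5 for v in vs]
    print("estimated alpha:", -loglog_slope(vs, densities))
```

On exact power-law input the fit recovers the exponent; with measured densities the estimate would carry sampling noise, which is why the text calls the slopes rough.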

Conjecture. The expected density of a covering produced by
the random greedy algorithm is $1 + \Theta(v^{-(k-t)/D})$, where
$D = \binom{k}{t} - 1$.

The following argument, though far from a proof, supports 
the conjecture. 
Table I: Estimates for $\alpha(k,t)$



Heuristic argument. Let $\alpha = (k - t)/D$. The conjecture is
equivalent to the statement that there are $\Theta(v^{t-\alpha})$ expected
t-sets not covered by a random greedy packing. (The first
three steps of Algorithm 1 constitute the random greedy packing
algorithm.) So consider the t-uniform hypergraph whose
edges are the t-sets still uncovered during the packing algorithm.
Assume that this hypergraph looks like a random hypergraph
with the same number of edges, and assume that the
packing algorithm has managed to leave just $c_1 v^{t-\alpha}(1 + o(1))$
edges in the hypergraph, for some positive constant $c_1$. We
show that a positive fraction of these edges (that is, $\Theta(v^{t-\alpha})$
in all) can never be covered by the packing; this provides
the $\Omega(v^{-\alpha})$ lower bound of the conjecture's error term.

Under the stated assumptions, the probability p that a given
edge exists in the hypergraph is asymptotic to $c_1 t!\, v^{-\alpha} =
c_2 v^{-\alpha}$, and the probability, for a given edge in the hypergraph
and a given k-set containing that edge, that the other
$\binom{k}{t} - 1 = D$ edges on those k vertices also exist is $p^D$. Therefore
the expected number of k-cliques that contain the given
edge is asymptotic to $p^D v^{k-t} / (k - t)! = c_3 v^{-\alpha D} v^{k-t} = c_3$, a
positive constant. But this number of k-cliques is Poisson distributed,
so is zero with probability asymptotic to $e^{-c_3}$, also a
positive constant, thus a positive fraction of the edges are contained
in no k-clique, as claimed. The matching $O(v^{t-\alpha})$ upper
bound follows from similar reasoning, and that completes the
argument. It, together with our empirical data, makes the conjecture
quite compelling. □
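The exponent bookkeeping in this argument can be illustrated numerically: with $p = c_2 v^{-\alpha}$ and $\alpha = (k-t)/D$, the expected clique count $\binom{v-t}{k-t} p^D$ should approach the constant $c_3 = c_2^D / (k-t)!$. The sketch below is only a check of that arithmetic; the function names and the choice $c_2 = 1$ are assumptions made for illustration.

```python
from math import comb, exp, factorial


def expected_cliques(v: int, k: int, t: int, c2: float = 1.0) -> float:
    """Expected number of k-cliques through a fixed edge when p = c2 * v^(-alpha)."""
    D = comb(k, t) - 1
    alpha = (k - t) / D
    p = c2 * v ** (-alpha)
    return comb(v - t, k - t) * p ** D


def limit_c3(k: int, t: int, c2: float = 1.0) -> float:
    """Limiting constant c3 = c2^D / (k - t)!."""
    D = comb(k, t) - 1
    return c2 ** D / factorial(k - t)


if __name__ == "__main__":
    for k, t in [(3, 2), (4, 2), (5, 3)]:
        c3 = limit_c3(k, t)
        for v in (10**3, 10**6):
            print(k, t, v, expected_cliques(v, k, t), c3, exp(-c3))
```

The last column, $e^{-c_3}$, is the limiting probability that a Poisson variable with mean $c_3$ is zero, that is, the fraction of leftover edges lying in no k-clique under the heuristic's assumptions.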

4. INDUCED COVERINGS AND RÖDL'S BOUND

While the greedy algorithm produces good coverings, it
works in time and space $\Theta(v^k)$. These can be reduced to time
$O(v^t \log v)$ and space $O(v^t)$ using the early abort strategy of
Corollary 1, but for larger values of v, k, and t, the induced
covering algorithm is more practical, because it is faster.

Theorem 4. For fixed k and t the expected density of an induced
covering is $1 + o(1)$.

Proof. For Step 1 of Algorithm 2 choose $\ell = \frac{1}{8} v^{1-1/t}$, and
choose the prime p such that

$$4\ell < \frac{v}{p} < 8\ell.$$

Such a prime exists by Bertrand's Postulate, which states that
there is always a prime between n and 2n. These choices ensure
that $p^t > v$, and that the affine $(p^t, p^{t-1}, t)$ covering by
hyperplanes has density $1 + O(v^{-1/t})$.
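With $\ell = \frac{1}{8} v^{1-1/t}$, the condition $4\ell < v/p < 8\ell$ is the same as $v^{1/t} < p < 2 v^{1/t}$, so the smallest prime with $p^t > v$ is a valid choice. The sketch below picks the parameters that way; the function names are hypothetical, trial division is used only for brevity, and $\ell$ is returned as a real number even though Algorithm 2 asks for an integer.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def choose_parameters(v: int, t: int):
    """Return (ell, p): ell = v^(1 - 1/t) / 8 and p the least prime with p^t > v."""
    ell = v ** (1 - 1 / t) / 8
    p = 2
    while p ** t <= v or not is_prime(p):  # exact integer test, no float roots
        p += 1
    return ell, p


if __name__ == "__main__":
    for v, t in [(1000, 2), (125, 3), (10**6, 3)]:
        ell, p = choose_parameters(v, t)
        print(v, t, p, 4 * ell, v / p, 8 * ell)
```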

By Corollary 2 the precomputed $(\ell',k,t)$ greedy coverings
of Step 2 have expected density $1 + O((\log v)^{-1/D})$. So by
running $O(\log \ell)$ trials per precomputed covering, we can ensure,
with probability greater than, for example, $1 - 1/\ell$, that
all precomputed coverings have density $1 + O((\log v)^{-1/D})$.

Now select the v-set V as a random subset of the points in
the affine covering, and consider a fixed t-set T of V. There
are, on average, $1 + O(v^{-1/t})$ hyperplanes containing T; let P
be one of them. The size of the intersection of V and $P \setminus T$
has a hypergeometric distribution from 0 to $v - t$ with mean

$$M = (v - t)\,\frac{p^{t-1} - t}{p^t - t}.$$

For p > 5 we have

$$(v - t)/2p < M < (v - t)/p,$$

thus $2\ell < M < 8\ell$ by our choice of p. So the probability that
the size of the intersection is at most $\ell$ or at least $9\ell$ is $O(e^{-c\ell})$
for some c > 0.
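The mean and its two bounds are easy to check numerically. The sketch below assumes the hypergeometric setup read off from the text: a sample of $v - t$ points drawn without replacement from the $p^t - t$ points outside T, of which $p^{t-1} - t$ lie on the hyperplane. The function names, the seed, and the trial count are hypothetical.

```python
import random


def intersection_mean(v: int, t: int, p: int) -> float:
    """Mean of the assumed hypergeometric distribution."""
    return (v - t) * (p ** (t - 1) - t) / (p ** t - t)


def simulated_mean(v: int, t: int, p: int, trials: int = 2000, seed: int = 0) -> float:
    """Empirical mean; population items below `on_plane` count as hyperplane points."""
    rng = random.Random(seed)
    population = p ** t - t
    on_plane = p ** (t - 1) - t
    total = 0
    for _ in range(trials):
        sample = rng.sample(range(population), v - t)
        total += sum(1 for x in sample if x < on_plane)
    return total / trials


if __name__ == "__main__":
    v, t, p = 1000, 2, 37
    m = intersection_mean(v, t, p)
    print((v - t) / (2 * p), m, (v - t) / p, simulated_mean(v, t, p))
```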

This intersection, together with T itself, is replaced in the
induced covering by an $(\ell',k,t)$ covering. If $\ell \le \ell' \le 9\ell$, then
this covering has density $1 + O((\log v)^{-1/D})$. If $\ell'$ is outside
this range, the covering has density $\binom{k}{t}$, but the probability of
this event is $O(e^{-c\ell})$, so the total expected number of k-sets
containing T coming from a given hyperplane containing T
is $1 + O((\log v)^{-1/D})$, and the total expected number coming
from all such hyperplanes is

$$\left(1 + O((\log v)^{-1/D})\right)\left(1 + O(v^{-1/t})\right) = 1 + O((\log v)^{-1/D}).$$

Thus the expected density of the induced covering is
$1 + O((\log v)^{-1/D})$. □



Corollary 3. The induced covering algorithm runs in time
and space $O(v^t)$.

Proof. By Corollary 1, precomputation takes time
$O(\ell^{t+1} \log^2 \ell)$, which is $O(v^t)$ by our choice of $\ell$. The
number of hyperplanes is $O(p^t) = O(v)$ by our choice of p,
so the time to compute the affine geometry is $O(v^2) = O(v^t)$.
For each hyperplane, the work to find the intersection and
convert it into an $(\ell',k,t)$ covering will vary, but the time
per block is constant. Hence the total time and space of the
algorithm is dominated by the size of the (v,k,t) covering,
which is also $O(v^t)$. □



Corollary 4. The induced covering has expected density
$1 + O((\log v)^{-1/D})$.

Furthermore, if, as we conjecture, the greedy covering has
expected density $1 + O(v^{-(k-t)/D})$, then the expected density
of the induced covering improves to $1 + O(v^{-(k-t)/D}) +
O(v^{-1/t}) = 1 + O(v^{-(k-t)/D})$.

The best way to use the induced covering algorithm in practice
is to first find or make a large table of good coverings with
small parameters using many different methods, and then use
these for the $(\ell',k,t)$ coverings. We used that strategy to produce
the induced coverings of our earlier paper [0].